SEARCH'97 White Paper

A New Personal and Enterprise Application
by Philippe Courtot

Contents:
Information Everywhere
Connecting People With Information
Search Reaches a New Level
Mining the "Corporate Memory"
SEARCH'97 Availability


INFORMATION EVERYWHERE

In the past decade, the proliferation of personal computers and distributed servers has dramatically changed the way electronic information is created and used within the enterprise. Unlike the centralized mainframe era in which back office systems held the corporation's entire information store, in the distributed computing era, individual users and work-groups are empowered to dynamically create documents and manage information to meet their specific information needs. Likewise, corporations have created and managed more corporate-wide information than ever before. As a result, there are massive amounts of vital information that need to be shared, stored and retrieved across the corporation. Indeed, Meta Group estimates that today the amount of private information stored globally doubles every 12-14 months. Today this information resides in a variety of distributed data stores, from relational databases and corporate application systems, to personal computers and user productivity applications.

The World Wide Web has yet again revolutionized the way we view and use information. Across the Web, public information services proliferate daily, offering immediate access to everything from late-breaking news to advice on how to start your own business. Within the past twelve months, the Web has evolved from an electronic mail backbone to a dynamic information resource for corporations, enabling real-time competitive research and analysis, more effective customer and vendor communications and ever-expanding electronic product promotions and sales.

With information in all forms and types becoming available at accelerating rates, the ability to find the right information with minimal effort is of paramount importance to ensure the continued productivity of corporate users and the continued competitiveness of the organization.

For Internet users, search engines have proliferated across the Web to enhance people's ability to locate and collect external information concerning their specific needs. However, these search facilities fall far short of expectations for accuracy and relevancy of returned information. Who has not searched the Internet regarding a subject only to be deluged with thousands of documents completely irrelevant to their actual need?

The time spent developing the right query or sifting through the mass of returned information (the "surf cost") could be used much more productively. Yet as the Web continues to expand, so will the problem of information retrieval.

The problems of locating information on the Internet are compounded within the enterprise or Intranet by dissimilar document types, incompatible information sources and geographically dispersed data stores. Today's corporation must find a mechanism to provide a corporate-wide "memory" of available information which can be easily accessed by its globally distributed users. To create this "corporate memory", disparate sources of information must be connected and easy access and navigation must be provided.

This "corporate memory" will enable organizations to automatically catalog documents and data stored across personal computers, work-groups and enterprise computing resources, making them transparently available to all information consumers.

Organizations require the ability to automatically organize and filter documents within this corporate memory, dynamically publishing information which has been specifically selected to meet the information needs of diverse groups, based on unique requirements.

Individual users must be able to access this corporate memory quickly, finding necessary information from a variety of resources, while customizing the search application to automatically deliver information which meets their specific areas of interest.

CONNECTING PEOPLE WITH INFORMATION

SEARCH'97 from Verity provides just such a solution.

With the introduction of SEARCH'97, Verity has elevated the concept of search from a simple, rudimentary tool to a robust enterprise-wide application. It is powerful enough to locate a single document across hundreds of corporate sites, yet simple enough to help a novice user locate a letter on his PC.

SEARCH'97 has been designed as a search application which will allow corporations as well as individuals to effectively "push" and "pull" information based on individual information needs. SEARCH'97 is an application which can immediately catalog, organize and publish any information according to a specific profile of interest, regardless of its source. SEARCH'97 will connect people with the information they need quickly and easily, enhancing productivity and performance.

SEARCH'97 automatically creates a "corporate memory" from all available information in the enterprise. SEARCH'97's "corporate memory" can incorporate information from over 100 applications and databases, and is in use today by over 500 large companies worldwide. Applications such as SAP, Lotus Notes, PC DOCS and Documentum, as well as leading relational databases, Informix, Sybase, ODI and Objectivity, are being delivered to customers using the Verity indexing format, ready to interact with SEARCH'97.

SEARCH'97 uses this common indexing format to join previously disparate corporate information into a single, integrated information resource.
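
The idea of a common index over disparate sources can be illustrated with a minimal sketch. It assumes a plain inverted index; the Verity indexing format itself is not described in this paper, and the source names, sample records and function names below are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample records from three unrelated sources; the source names
# and contents are illustrative only.
SOURCES = {
    "notes": {"n1": "Quarterly sales forecast for Europe"},
    "sql": {"row42": "Europe sales order backlog"},
    "files": {"memo.doc": "Travel policy update"},
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(sources):
    """Map each word to the set of (source, document id) pairs containing it."""
    index = defaultdict(set)
    for source, docs in sources.items():
        for doc_id, text in docs.items():
            for word in tokenize(text):
                index[word].add((source, doc_id))
    return index

def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits

index = build_index(SOURCES)
print(sorted(search(index, "europe sales")))
```

In this sketch a single query reaches documents from both the "notes" and the "sql" source, because every source is reduced to the same word-to-document mapping before it is searched.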

SEARCH'97 leverages Verity's proven search technologies to bring the most advanced search capabilities to the commercial marketplace. The original search technology was used in the late '80s by information specialists within large corporations and government agencies for critical information mining.

Today, this sophisticated search capability has been extended to become SEARCH'97, providing easy-to-use yet highly intelligent search components which:

SEARCH'97 is a foundation for evolving search capabilities far beyond today's macro or document-based concepts.

Through SEARCH'97's unique architecture and functionality, individuals and organizations alike will soon be able to mine their "corporate memory" at a discrete micro level, intelligently searching for content-based relationships and unique associations within the information. SEARCH'97 sets the stage to enable users to intelligently ask for known and unknown information, to become the means for corporations to differentiate themselves in today's competitive marketplace.

SEARCH REACHES A NEW LEVEL

SEARCH'97 is more than a collection of search functionality. It is a powerful and flexible platform for deploying search applications across the enterprise. It is the industry's first search application platform, providing the necessary infrastructure and components to create powerful information retrieval systems. Advanced, intelligent search components can be flexibly combined and brought into play to extend SEARCH'97's core functionality, based on the specific requirements of the corporation or individual. All search application functionality is made available to users via a single user interface.

The SEARCH'97 applications framework is designed to enable corporations to easily develop and deploy information retrieval applications which can be transparently integrated into the SEARCH'97 platform. This plug-and-play design allows additional functions and components to be added at any time.

Because all SEARCH'97 applications are integrated within the same framework, the SEARCH'97 architecture can be enhanced and extended as applications and installations evolve. New intelligent search modules can be developed by Verity, third parties or corporate developers, and transparently integrated within SEARCH'97's architecture to extend the application functionality based on specific requirements.

In this initial release of SEARCH'97, Verity is announcing a number of search components which extend functionality, including:

SEARCH'97 implements a 3-tiered information search architecture, enabling search application processing to be distributed based on the requirements of the organization.

Local search capabilities are implemented within an individual personal computer, while enterprise-wide information publishing may reside on a centralized IT platform. This 3-tiered architecture, combined with the component framework of SEARCH'97, enables user organizations to flexibly package and deploy appropriate search applications.

Search functionality can be easily extended or enhanced as organizations change and grow with no need to re-implement the entire search application. SEARCH'97 delivers a single search application which can scale to meet the needs of every information consumer, from individual workers to work-groups and sophisticated corporate enterprises.

SEARCH'97 provides a single user interface to search disparate sources of information. Using SEARCH'97 PERSONAL, users can immediately index and locate any information that is available on their local drives, flexibly viewing that information without actually launching the application which created it.

SEARCH'97 PERSONAL further provides the user with an individual window into the power of SEARCH'97 across the enterprise and the Internet. It provides the interface for users to create queries, search agents and implement searches across SEARCH'97's "corporate memory". SEARCH'97 PERSONAL is supported from within Netscape's Web Browser, Microsoft's Internet Explorer or Microsoft Exchange. SEARCH'97 PERSONAL can view over 200 file types and users can view information as it was created even when the originating application is not locally available.

For Internet users, SEARCH'97 PERSONAL will enable personal Internet indexing, a feature which creates and stores indexes of all web pages browsed by a user. Personal Internet indexes can be created automatically as the user browses or at the user's discretion.

At the core of the SEARCH'97 architecture is the Information Server. This powerful search engine creates and manages the information catalogs that comprise the "corporate memory". Through the Information Server, users can access and mine this catalog of information using their SEARCH'97 interface or any standard Web browser. The Information Server provides the key integration point for additional search components such as the Agent Server and Information Gateways.

The Information Server also includes an efficient web spider which automatically catalogs web server information throughout the organization. These catalogs can be added to the overall "corporate memory" and dynamically updated based on specific corporate policy. The Information Server's administration forms and point-and-click set-up make it very easy to install and administer.


SEARCH'97 Agent Server is an advanced server component which enables individuals and corporations to create, maintain and deploy intelligent agents which automatically search the "corporate memory" for specific information.

Agents represent specific queries, or sets of queries, which make up a particular information profile. An almost unlimited number of corporate or individual agents can be deployed, representing a diverse array of information profiles.

Agents continuously monitor SEARCH'97's "corporate memory" for new or changed information which matches the stored information profile. When information that matches the query is observed, the agents automatically carry out the defined Agent action or notification. Agents can notify users of new information via a variety of methods, including a customizable web page, electronic mail or pager. In addition to monitoring information sources, application agents can be used to categorize and rank information based on defined categorization or ranking criteria. Finally, agents can also be directed to watch for changes in relational databases and notify users or applications.
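
The agent behavior described above (a stored profile, a watch for new or changed information, and an action on a match) can be sketched as follows. This is an illustration under simple assumptions, not the Agent Server's actual interface: the `Agent` class, its fields and the keyword-matching rule are all hypothetical, and notification is reduced to appending to a list.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A stored information profile plus the action to run on a match."""
    name: str
    keywords: frozenset                   # every keyword must appear for a match
    notify: Callable[[str, str], None]    # called with (agent name, document id)
    seen: set = field(default_factory=set)

    def check(self, documents):
        """Run the action once for each new or changed matching document."""
        for doc_id, text in documents.items():
            version = (doc_id, text)
            if version in self.seen:
                continue                  # unchanged since the last pass
            self.seen.add(version)
            words = set(text.lower().split())
            if self.keywords <= words:
                self.notify(self.name, doc_id)

# Hypothetical notification channel: a list standing in for mail or a pager.
outbox = []
agent = Agent("pricing-watch", frozenset({"price", "increase"}),
              lambda name, doc_id: outbox.append((name, doc_id)))

docs = {"a": "supplier announces price increase", "b": "holiday schedule"}
agent.check(docs)   # first pass: "a" matches
agent.check(docs)   # nothing changed, so no new notification
docs["b"] = "price increase postponed"
agent.check(docs)   # "b" changed and now matches
print(outbox)
```

Remembering each document version that has already been examined is what lets the agent stay silent on repeated passes and react only to new or changed information.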

SEARCH'97 includes a number of optional intelligent search components, which enhance the effectiveness and accuracy of the search application.

These components can be integrated with the core SEARCH'97 Information Server based on individual corporate requirements. SEARCH'97's intelligent search components include functionalities for Enhanced Query, Visualization and Knowledge and Navigation Tools.

Enhanced Query extends the SEARCH'97 portfolio of query technologies, including NLP, QBE and Topics. Enhanced query techniques make SEARCH'97 easier to use, while increasing the accuracy of results. Using NLP (Natural Language Processing), the query engine recognizes sets of related words as phrases instead of as unique words by making use of noun phrase extraction. QBE (Query By Example) allows users to refine queries by using examples of relevant information to form new search parameters.
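
As an illustration of Query By Example, the sketch below forms new search parameters from documents the user has marked as relevant. It assumes the simplest possible approach, taking the most frequent non-trivial terms; the stopword list, function names and sample texts are hypothetical and do not describe how SEARCH'97 implements QBE.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is"}

def terms(text):
    """Return the lowercase words of the text, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def query_from_examples(examples, size=3):
    """Return the `size` most frequent terms across the example documents."""
    counts = Counter()
    for text in examples:
        counts.update(terms(text))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:size]]

examples = [
    "Merger of the two banks approved",
    "Regulators review the bank merger",
    "Merger talks and bank shares",
]
print(query_from_examples(examples, size=2))  # prints ['merger', 'bank']
```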

Visualization components enhance the representation of information, making it easier for users to immediately identify relevant documents.

Clustering automatically organizes information returned from a search into clusters based on commonality of information. Users can search through clusters to find relevant information without having to examine every individual document.
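
A minimal sketch of clustering by commonality of information follows. It assumes a greedy single-pass grouping on word overlap (Jaccard similarity), which is one simple technique among many and is not claimed to be the method SEARCH'97 uses; the threshold value and sample results are hypothetical.

```python
def word_set(text):
    """Return the set of lowercase words in the text."""
    return set(text.lower().split())

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(results, threshold=0.3):
    """Put each result into the first cluster whose seed it resembles."""
    clusters = []  # list of (seed word set, [member titles])
    for title in results:
        words = word_set(title)
        for seed, members in clusters:
            if jaccard(seed, words) >= threshold:
                members.append(title)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append((words, [title]))
    return [members for _, members in clusters]

results = [
    "java coffee beans roasting",
    "java programming language tutorial",
    "roasting coffee beans at home",
    "java language reference",
]
for group in cluster(results):
    print(group)
```

Here a search on an ambiguous word such as "java" yields two groups, one about coffee and one about programming, so the user can discard a whole group at a glance.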

Through summarization, SEARCH'97 intelligently creates content summaries of individual documents, based on the most significant sentences contained within the document. This provides a much better overview of document content than traditional summaries which contain only the document title or first few lines. All returned information is relevance ranked, using advanced ranking technologies.
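
Summarization based on the most significant sentences can be sketched as below. The scoring rule is an assumption made for illustration (a sentence scores the average document-wide frequency of its words); the paper does not state how SEARCH'97 measures significance, and the stopword list and sample text are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "is", "at", "of", "and", "to", "in", "when", "every"}

def content_words(sentence):
    """Return the lowercase words of a sentence, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence positions by descending score; earlier sentences win ties.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    return [sentences[i] for i in sorted(ranked[:max_sentences])]

text = ("The index covers every file server. Lunch is at noon. "
        "The index is rebuilt when a file server changes.")
print(summarize(text, max_sentences=2))
```

With this text the off-topic sentence about lunch is dropped, since none of its words recur elsewhere in the document.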

Knowledge Tools provide simple mechanisms for corporations to create specialized subsets across standardized definitions or topics. Using knowledge tools, administrators can create customized dictionaries and thesauri, based on the specific environment of the organization's business, for example, specialized capabilities for legal firms. Administrators can create specialized topic editors for distribution and use across the organization.
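
To show how a customized thesaurus might affect a search, the sketch below expands each query word with its listed synonyms before matching. The thesaurus entries (a small legal vocabulary), the function names and the matching rule are hypothetical and are not taken from the SEARCH'97 knowledge tools.

```python
# Hypothetical thesaurus an administrator at a legal firm might maintain.
THESAURUS = {
    "attorney": {"lawyer", "counsel"},
    "contract": {"agreement"},
}

def expand(query, thesaurus):
    """Replace each query word with a sorted group of the word and its synonyms."""
    return [sorted({word} | thesaurus.get(word, set()))
            for word in query.lower().split()]

def matches(expanded, text):
    """A text matches when it contains at least one word from every group."""
    words = set(text.lower().split())
    return all(words & set(group) for group in expanded)

expanded = expand("attorney contract", THESAURUS)
print(expanded)
print(matches(expanded, "Counsel reviewed the agreement"))
print(matches(expanded, "Counsel reviewed the invoice"))
```

The first document matches even though it contains neither "attorney" nor "contract", which is the practical benefit of a thesaurus tuned to the organization's own vocabulary.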

Navigation tools enhance the user's ability to search throughout the SEARCH'97 "corporate memory" and beyond. Navigation can include moving through documents via hyperlinks, as well as navigation by individual categories.

SEARCH'97 is available on a wide array of platforms including Windows, Macintosh and popular UNIX processors. SEARCH'97 also supports the Microsoft Exchange platform, allowing users to easily search across their mail, public and private folders from the Microsoft Exchange user interface and also to conduct enterprise or Internet searches.

Verity also provides a Developer's Kit for the SEARCH'97 search and indexing engine, as well as for the Agent Server, to allow easy integration of third-party applications.

Verity expects third parties to deliver additional search capabilities based on SEARCH'97 in the near term.

MINING THE "CORPORATE MEMORY" IS THE NEXT IMPORTANT APPLICATION

Once all sources of information have been adequately indexed, there is an opportunity to mine the virtual repository of information, or "Corporate Memory", which has been created. For example, you may be interested in the relationships that your company has established with another company, rather than in general information about that other company. What you want, therefore, is to go directly from a simple query to that specific information. Verity is currently working on adding this sort of capability to its SEARCH'97 application by developing fact extraction engines, user-definable categories and a fast relational engine to quickly match relevant relationships.

SEARCH'97 AVAILABILITY

Key components of the Verity SEARCH'97 platform are now shipping.

